Lecture 12: Principal Components
Abstract
Consider the standard setting in which we are given $n$ points in $d$ dimensions; call these $\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n$. As before, our goal is to reduce the number of dimensions to a small number $k$. In principal component analysis (PCA), we model the data by a $k$-dimensional subspace, and we find the subspace for which the error of this representation is smallest.

Suppose $k = 1$. Then we want to approximate the data with a line. Assume the data is centered, so that $\sum_{i=1}^{n} \vec{x}_i = 0$, and that the line passes through the origin. Let the line correspond to a direction $\vec{w}$, a unit vector. What is the error in approximating $\vec{x}_i$ with $\vec{w}$? We can use the perpendicular distance between the point $\vec{x}_i$ and the line represented by $\vec{w}$. It is easy to check that the perpendicular is given by $\vec{x}_i - (\vec{x}_i \cdot \vec{w})\,\vec{w}$, so that its squared length is $\|\vec{x}_i\|^2 - (\vec{x}_i \cdot \vec{w})^2$, by the Pythagorean theorem (using the fact that $\vec{w}$ is a unit vector).
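To make this concrete, here is a minimal sketch in Python/NumPy (the synthetic data and variable names are my own, not part of the lecture) that centers a point set, takes the best-fit direction $\vec{w}$ to be the top right singular vector of the data matrix, and verifies that each point's squared perpendicular distance equals $\|\vec{x}_i\|^2 - (\vec{x}_i \cdot \vec{w})^2$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the lecture's setting: n points in d dimensions,
# stored as the rows of X. (The data here is made up for illustration.)
n, d = 100, 5
X = rng.normal(size=(n, d))

# Center the data so that sum_i x_i = 0, as the lecture assumes.
X = X - X.mean(axis=0)

# The best-fit direction w (the first principal component) is the top
# right singular vector of the centered data matrix.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
w = Vt[0]  # a unit vector of shape (d,)

# Perpendicular component of each point: x_i - (x_i . w) w.
proj = X @ w                       # the coefficients x_i . w, shape (n,)
residuals = X - np.outer(proj, w)  # shape (n, d)

# The squared perpendicular distance, computed two equivalent ways:
# directly, and via the Pythagorean identity ||x_i||^2 - (x_i . w)^2.
sq_dist_direct = np.sum(residuals ** 2, axis=1)
sq_dist_pythag = np.sum(X ** 2, axis=1) - proj ** 2
assert np.allclose(sq_dist_direct, sq_dist_pythag)

print("total squared error of the best-fit line:", sq_dist_direct.sum())
```

Taking $\vec{w}$ from the SVD is the standard way to compute the first principal component; the assertion simply checks the Pythagorean identity derived above on each point.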
Similar resources
CS 168: The Modern Algorithmic Toolbox, Lecture #8: How PCA Works
Last lecture introduced the idea of principal components analysis (PCA). The definition of the method is, for a given data set and parameter $k$, to compute the $k$-dimensional subspace (through the origin) that minimizes the average squared distance between the points and the subspace, or, equivalently, that maximizes the variance of the projections of the data points onto the subspace. We talked ab...
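The equivalence this abstract mentions follows from the identity above: summing over the points, $\sum_i \|\vec{x}_i - (\vec{x}_i \cdot \vec{w})\vec{w}\|^2 = \sum_i \|\vec{x}_i\|^2 - \sum_i (\vec{x}_i \cdot \vec{w})^2$, and the first term on the right does not depend on $\vec{w}$, so minimizing the total squared distance is the same as maximizing the variance of the projections (the data being centered). Here is a quick numerical check of that claim, again only a sketch on made-up data, not code from either lecture:

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up anisotropic data in the plane, centered as before.
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [1.0, 1.0]])
X = X - X.mean(axis=0)

# Sweep candidate unit directions w(theta) and record both objectives:
# the total squared perpendicular distance to the line through the
# origin in direction w, and the variance of the projections x_i . w.
thetas = np.linspace(0.0, np.pi, 1000, endpoint=False)
sq_dists, variances = [], []
for t in thetas:
    w = np.array([np.cos(t), np.sin(t)])
    proj = X @ w
    residuals = X - np.outer(proj, w)
    sq_dists.append(np.sum(residuals ** 2))
    variances.append(proj.var())

# The distance-minimizing direction coincides with the
# variance-maximizing one, as the abstract states.
assert np.argmin(sq_dists) == np.argmax(variances)
```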
CS 168: The Modern Algorithmic Toolbox
Principal components analysis (PCA) is a basic and widely used technique for exploring data. If you go on to take specialized courses in machine learning or data mining, you'll certainly hear more about it. The goal of this lecture is to develop your internal mapping between the linear algebra used to describe the method and the simple geometry that explains what's really going on. Ideally, after ...
Principal Fitted Components for Dimension Reduction in Regression
We provide a remedy for two concerns that have dogged the use of principal components in regression: (i) principal components are computed from the predictors alone and do not make apparent use of the response, and (ii) principal components are not invariant or equivariant under full rank linear transformation of the predictors. The development begins with principal fitted components [Cook, R. ...
Fisher Lecture: Dimension Reduction in Regression
Beginning with a discussion of R. A. Fisher’s early written remarks that relate to dimension reduction, this article revisits principal components as a reductive method in regression, develops several model-based extensions and ends with descriptions of general approaches to model-based and model-free dimension reduction in regression. It is argued that the role for principal components and rel...